Replace boolean feature flags with three-mode FeatureMode in placement shim#736
Conversation
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (26)
📝 WalkthroughWalkthroughThe pull request refactors the cortex-placement-shim from binary feature flags to per-endpoint feature modes ( Changes
Sequence DiagramsequenceDiagram
participant Client
participant Handler
participant Config
participant Upstream as Upstream API
participant CRD as Kubernetes CRD
Client->>Handler: HTTP Request to Endpoint
activate Handler
Handler->>Config: Check Features.EndpointName.orDefault()
activate Config
Config-->>Handler: Feature Mode (passthrough/hybrid/crd)
deactivate Config
alt Passthrough Mode
Handler->>Upstream: Forward Request
activate Upstream
Upstream-->>Handler: Response
deactivate Upstream
else Hybrid Mode
Handler->>Upstream: Forward & Fetch
Handler->>CRD: Query Kubernetes
activate CRD
CRD-->>Handler: K8s Data
deactivate CRD
Handler->>Handler: Merge Results
else CRD Mode
Handler->>CRD: Query Only Kubernetes
activate CRD
CRD-->>Handler: Data or 404
deactivate CRD
else Unknown Mode
Handler->>Handler: HTTP 500 Error
end
Handler-->>Client: Response
deactivate Handler
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@claude can you say hello world? |
|
Hello World! 👋 Thanks for the ping, @PhilippMatthes. Let me know if there's anything specific you'd like me to review or help with on this PR! |
|
@claude can you read through my changes and adapt the documentation under docs/ accordingly? You can check your changes directly onto this feature branch. |
📝 Documentation Update CompleteI've read through the PR changes and updated Checklist
Changes Made (commit 0cfa55f)Updated
|
|
@claude thanks, can you ensure there are no merge conflicts in docs/architecture.md with main? |
✅ Merge conflicts in docs/architecture.md resolved
What happenedPR #757 ( There were 3 merge conflicts, all in
ResolutionRebased the feature branch onto main ( The branch is now cleanly based on main with no merge conflicts. ✨ |
…t shim Introduce a FeatureMode type (passthrough, hybrid, crd) to replace the inconsistent boolean EnableResourceProviders/EnableRoot/EnableTraits flags. This enables gradual cutover from upstream Placement to CRD-only operation. Each endpoint group now has its own mode field in featuresConfig. Passthrough forwards to upstream as before, hybrid combines upstream with local CRD data, and crd operates without upstream dependency. Adds version intersection logic for hybrid root, a periodic trait sync routine for hybrid traits, and CRD-only resource provider listing. Endpoints not yet implementing hybrid/crd return 501. Updates Helm values, E2E tests, and unit tests accordingly.
Replace the boolean feature flag documentation with the new passthrough/hybrid/crd feature mode system. Document all 11 endpoint groups, explain mode semantics for resource providers (hybrid merge vs crd-only listing), root endpoint (version intersection in hybrid mode), and traits (periodic sync loop in hybrid mode). Also add the auth, passthrough diagram, and root endpoint sections that were missing from this branch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
0cfa55f to
a1af1a2
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/shim/placement/handle_traits_e2e.go (1)
20-38:⚠️ Potential issue | 🟡 MinorDoc comment still references the removed
enableTraitsflag.Lines 22–23 and 28 describe the phase gating in terms of
enableTraits, but the runtime check has moved toconfig.Features.Traits.orDefault() == FeatureModePassthrough. Worth updating the doc comment so it matches the new mode-based gating to avoid confusing future readers.📝 Proposed wording
-// Phase 1 — read-only (always runs): -// -// 1. GET /traits — list all traits; when forwarding to upstream (enableTraits -// is false) verify at least one trait exists. +// Phase 1 — read-only (always runs): +// +// 1. GET /traits — list all traits; when traits mode is passthrough (forwarding +// to upstream) verify at least one trait exists. @@ -// Phase 2 — CRUD (only when enableTraits is true): +// Phase 2 — CRUD (only when traits mode is non-passthrough):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shim/placement/handle_traits_e2e.go` around lines 20 - 38, Update the doc comment in handle_traits_e2e.go to remove the stale enableTraits wording and describe the mode-based gating: replace mentions of "enableTraits" with references to the Traits feature mode check (config.Features.Traits.orDefault() == FeatureModePassthrough), e.g. note that Phase 1 always runs and that GET /traits is forwarded to upstream when the Traits feature is in passthrough mode (FeatureModePassthrough), and that Phase 2 CRUD steps run only when the Traits feature mode is passthrough/managed as appropriate—use the symbol config.Features.Traits.orDefault() and FeatureModePassthrough in the wording so readers see the current runtime check instead of the removed enableTraits flag.
🧹 Nitpick comments (6)
internal/shim/placement/handle_traits_test.go (1)
68-110: Update staleenableTraits=…section headers.The boolean no longer exists; these section dividers should reference the new modes (
passthroughvscrd) so the file is self-explanatory.♻️ Suggested edit
-// --- Passthrough tests (enableTraits=false) --- +// --- Passthrough mode tests (Traits = FeatureModePassthrough) --- @@ -// --- Handler tests (enableTraits=true) --- +// --- Local handler tests (Traits = FeatureModeCRD) ---🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shim/placement/handle_traits_test.go` around lines 68 - 110, The file still uses outdated section header comments referencing enableTraits=true/false; update the comment lines above the passthrough tests and the following handler tests to use the new mode names (e.g., change the "// --- Passthrough tests (enableTraits=false) ---" comment to something like "// --- Passthrough tests (mode=passthrough) ---" and the "// --- Handler tests (enableTraits=true) ---" comment to "// --- Handler tests (mode=crd) ---") so the sections around TestHandleListTraitsPassthrough, TestHandleShowTraitPassthrough, TestHandleUpdateTraitPassthrough, TestHandleDeleteTraitPassthrough and the subsequent handler tests correctly reflect the current modes.internal/shim/placement/handle_allocations.go (1)
26-104: Extract the per-mode dispatch into a helper to avoid 4× duplication in this file.The same 7-line switch (
passthrough → forward,hybrid|crd → 501, default → 500) is repeated for each handler in this file, and the identical pattern appears inhandle_resource_classes.go(5 handlers),handle_resource_provider_aggregates.go(2 handlers), andhandle_usages.go. Centralising it keeps the per-endpoint behaviour consistent and shrinks each handler to a single line.♻️ Suggested helper (place alongside `FeatureMode` in `shim.go`)
// dispatchMode applies the standard per-endpoint mode policy: // passthrough forwards upstream; hybrid/crd return 501; unknown returns 500. func (s *Shim) dispatchMode(w http.ResponseWriter, r *http.Request, m FeatureMode) { switch m.orDefault() { case FeatureModePassthrough: s.forward(w, r) case FeatureModeHybrid, FeatureModeCRD: http.Error(w, fmt.Sprintf("%s mode is not yet implemented for this endpoint", m), http.StatusNotImplemented) default: http.Error(w, "unknown feature mode", http.StatusInternalServerError) } }Each handler then collapses to:
- switch s.config.Features.Allocations.orDefault() { - case FeatureModePassthrough: - s.forward(w, r) - case FeatureModeHybrid, FeatureModeCRD: - http.Error(w, fmt.Sprintf("%s mode is not yet implemented for this endpoint", s.config.Features.Allocations), http.StatusNotImplemented) - default: - http.Error(w, "unknown feature mode", http.StatusInternalServerError) - } + s.dispatchMode(w, r, s.config.Features.Allocations)The same one-line replacement applies to every other handler in
handle_resource_classes.go,handle_resource_provider_aggregates.go, andhandle_usages.go, and lets thefmtimport be dropped from those files.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shim/placement/handle_allocations.go` around lines 26 - 104, The handlers duplicate the same switch logic for feature modes; add a helper method func (s *Shim) dispatchMode(w http.ResponseWriter, r *http.Request, m FeatureMode) placed alongside FeatureMode in shim.go that implements the shared switch (m.orDefault(): FeatureModePassthrough -> s.forward(w,r); FeatureModeHybrid/FeatureModeCRD -> http.Error 501 with fmt message using m; default -> http.Error 500). Then update each handler (e.g., HandleManageAllocations, HandleListAllocations, HandleUpdateAllocations, HandleDeleteAllocations) to replace the switch with a single call s.dispatchMode(w, r, s.config.Features.Allocations), and remove now-unused fmt imports from the files where applicable.internal/shim/placement/handle_resource_providers_test.go (1)
886-980: Rename stale "FeatureFlagOff" terminology to match the new mode model.The mapping
EnableResourceProviders=false → FeatureModePassthroughis correct, but the surrounding identifiers and messages still talk about "flag off". With the boolean removed, this is now misleading and asymmetric with the rest of the file (where the default shim at Line 102 usesFeatureModeHybridwithout any "flag on" naming).♻️ Suggested rename
// --------------------------------------------------------------------------- -// Feature flag tests +// Passthrough mode tests // --------------------------------------------------------------------------- -func TestHandleResourceProviders_FeatureFlagOff(t *testing.T) { +func TestHandleResourceProviders_Passthrough(t *testing.T) { hv1Obj := &hv1.Hypervisor{ ObjectMeta: metav1.ObjectMeta{Name: "hv-flagtest"}, Status: hv1.HypervisorStatus{HypervisorID: validUUID}, } - newFlagOffShim := func(t *testing.T, upstreamStatus int, upstreamBody string) *Shim { + newPassthroughShim := func(t *testing.T, upstreamStatus int, upstreamBody string) *Shim {…and update the call sites and the
"flag off should forward, not 409"messages accordingly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shim/placement/handle_resource_providers_test.go` around lines 886 - 980, The test and helper still use "flag off" wording but the code now uses modes; rename TestHandleResourceProviders_FeatureFlagOff to TestHandleResourceProviders_PassthroughMode (or similar), rename helper newFlagOffShim to newPassthroughShim, and replace all "flag off" messages (e.g. "flag off should forward, not 409") with wording that references Passthrough/FeatureModePassthrough (e.g. "passthrough mode should forward, not 409"); update the comment header and any other identifiers or log strings that mention "flag off" to use FeatureModePassthrough so names and messages consistently reflect the mode model and reference the existing FeatureModePassthrough, TestHandleResourceProviders_FeatureFlagOff, and newFlagOffShim symbols when making the edits.internal/shim/placement/handle_reshaper.go (1)
26-35: Consider extracting the mode-dispatch into a shared helper.The same 4-arm switch (
passthrough → forward,hybrid|crd → 501, default → 500) is now duplicated across all 11 endpoint groups (handle_allocation_candidates.go, handle_resource_provider_inventories.go ×6, handle_resource_provider_traits.go ×3, handle_resource_provider_allocations.go, handle_resource_provider_usages.go, etc.). A small helper onShimwould centralize the “hybrid/crd not yet implemented” boilerplate so future per-endpoint logic only needs to override the one or two arms it actually implements, and would also guarantee the 501 message stays in sync withorDefault().♻️ Sketch of the helper
// dispatchPassthroughOnly forwards in passthrough mode, returns 501 for // hybrid/crd, and 500 for unknown modes. Use this for endpoints whose // hybrid/crd behavior is not implemented yet. func (s *Shim) dispatchPassthroughOnly(w http.ResponseWriter, r *http.Request, mode FeatureMode) { switch mode { case FeatureModePassthrough: s.forward(w, r) case FeatureModeHybrid, FeatureModeCRD: http.Error(w, fmt.Sprintf("%s mode is not yet implemented for this endpoint", mode), http.StatusNotImplemented) default: http.Error(w, "unknown feature mode", http.StatusInternalServerError) } }Call sites then collapse to:
- switch s.config.Features.Reshaper.orDefault() { - case FeatureModePassthrough: - s.forward(w, r) - case FeatureModeHybrid, FeatureModeCRD: - http.Error(w, fmt.Sprintf("%s mode is not yet implemented for this endpoint", s.config.Features.Reshaper), http.StatusNotImplemented) - default: - http.Error(w, "unknown feature mode", http.StatusInternalServerError) - } + s.dispatchPassthroughOnly(w, r, s.config.Features.Reshaper.orDefault())As a side benefit, this drops the
fmtimport from each handler file and ensures the 501 message reflects the resolved mode (orDefault()) rather than the raw, possibly-empty field value.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shim/placement/handle_reshaper.go` around lines 26 - 35, Extract the duplicated mode-dispatch into a Shim helper named dispatchPassthroughOnly that accepts (w http.ResponseWriter, r *http.Request, mode FeatureMode) and implements the 3-way switch: call s.forward(w,r) for FeatureModePassthrough, return http.StatusNotImplemented with a message including the resolved mode for FeatureModeHybrid|FeatureModeCRD, and return http.StatusInternalServerError for default; then replace the switch in HandlePostReshaper (and the other handlers) with a single call to s.dispatchPassthroughOnly(w, r, s.config.Features.Reshaper.orDefault()), keeping references to s.forward, FeatureModePassthrough, FeatureModeHybrid, FeatureModeCRD and orDefault() to locate the code to change.internal/shim/placement/handle_resource_provider_traits_test.go (1)
74-146: Consolidate the Hybrid/CRD tests into a single table-driven test.The two new test functions are byte-for-byte identical except for
FeatureModeHybridvsFeatureModeCRD. Folding them into one table-driven test reduces ~70 lines of duplication and matches the project guideline.♻️ Proposed consolidation
-func TestHandleResourceProviderTraits_HybridMode(t *testing.T) { - down, up := newTestTimers() - s := &Shim{ - config: config{ - PlacementURL: "http://should-not-be-called:1234", - Features: featuresConfig{ResourceProviderTraits: FeatureModeHybrid}, - }, - maxBodyLogSize: 4096, - downstreamRequestTimer: down, - upstreamRequestTimer: up, - } - t.Run("GET returns 501", func(t *testing.T) { /* ... */ }) - t.Run("PUT returns 501", func(t *testing.T) { /* ... */ }) - t.Run("DELETE returns 501", func(t *testing.T) { /* ... */ }) -} - -func TestHandleResourceProviderTraits_CRDMode(t *testing.T) { - // duplicate of the above with FeatureModeCRD -} +func TestHandleResourceProviderTraits_NonPassthroughReturns501(t *testing.T) { + for _, mode := range []FeatureMode{FeatureModeHybrid, FeatureModeCRD} { + t.Run(string(mode), func(t *testing.T) { + down, up := newTestTimers() + s := &Shim{ + config: config{ + PlacementURL: "http://should-not-be-called:1234", + Features: featuresConfig{ResourceProviderTraits: mode}, + }, + maxBodyLogSize: 4096, + downstreamRequestTimer: down, + upstreamRequestTimer: up, + } + cases := []struct { + method string + handler http.HandlerFunc + }{ + {"GET", s.HandleListResourceProviderTraits}, + {"PUT", s.HandleUpdateResourceProviderTraits}, + {"DELETE", s.HandleDeleteResourceProviderTraits}, + } + for _, c := range cases { + t.Run(c.method, func(t *testing.T) { + w := serveHandler(t, c.method, "/resource_providers/{uuid}/traits", + c.handler, "/resource_providers/"+validUUID+"/traits") + if w.Code != http.StatusNotImplemented { + t.Fatalf("status = %d, want %d", w.Code, http.StatusNotImplemented) + } + }) + } + }) + } +}As per coding guidelines: "Use struct-based test cases when applicable, but limit yourself to the most relevant cases".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shim/placement/handle_resource_provider_traits_test.go` around lines 74 - 146, Replace the two nearly identical tests TestHandleResourceProviderTraits_HybridMode and TestHandleResourceProviderTraits_CRDMode with one table-driven test (e.g., TestHandleResourceProviderTraits) that iterates over cases for ResourceProviderTraits modes {FeatureModeHybrid, FeatureModeCRD}; for each case create a Shim with Features.ResourceProviderTraits set from the case (using newTestTimers, maxBodyLogSize, PlacementURL as before) and run subtests for the three methods (s.HandleListResourceProviderTraits, s.HandleUpdateResourceProviderTraits, s.HandleDeleteResourceProviderTraits) calling serveHandler with the same path "/resource_providers/"+validUUID+"/traits" and asserting the response code is http.StatusNotImplemented; remove the two original functions (TestHandleResourceProviderTraits_HybridMode and TestHandleResourceProviderTraits_CRDMode) after consolidation.internal/shim/placement/handle_resource_providers.go (1)
462-618: Extract the shared query-filter pipeline.The filter chain (
uuid,name,member_of,in_tree,required,resources) is duplicated almost verbatim betweenlistResourceProvidersHybrid(Lines 508–531) andlistResourceProvidersCRD(Lines 586–610). Any future filter (e.g.,forbidden_aggregates) will need to be kept in sync in two places. A small helper reduces the divergence risk.♻️ Proposed helper extraction
+// applyHypervisorQueryFilters runs the placement-style query filters against +// the given hypervisor list and returns the filtered slice. Callers handle +// the HTTP error response when a 400 is appropriate. +func applyHypervisorQueryFilters(ctx context.Context, hvs []hv1.Hypervisor, query url.Values) ([]hv1.Hypervisor, error) { + filtered := hvs + if v := query.Get("uuid"); v != "" { + filtered = filterHypervisorsByUUID(ctx, filtered, v) + } + if v := query.Get("name"); v != "" { + filtered = filterHypervisorsByName(ctx, filtered, v) + } + if vals := query["member_of"]; len(vals) > 0 { + filtered = filterHypervisorsByMemberOf(ctx, filtered, vals) + } + if v := query.Get("in_tree"); v != "" { + filtered = filterHypervisorsByInTree(ctx, filtered, v) + } + if vals := query["required"]; len(vals) > 0 { + filtered = filterHypervisorsByRequired(ctx, filtered, vals) + } + if v := query.Get("resources"); v != "" { + var err error + filtered, err = filterHypervisorsByResources(ctx, filtered, v) + if err != nil { + return nil, err + } + } + return filtered, nil +}This requires adding
"net/url"to imports and replacing both filter blocks with a single call that handles the400path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/shim/placement/handle_resource_providers.go` around lines 462 - 618, Extract the duplicated filter pipeline into a helper like applyHypervisorFilters(ctx context.Context, query url.Values, items []hv1.Hypervisor) ([]hv1.Hypervisor, error) that applies filterHypervisorsByUUID, filterHypervisorsByName, filterHypervisorsByMemberOf, filterHypervisorsByInTree, filterHypervisorsByRequired and filterHypervisorsByResources (returning a non-nil error only for invalid resources), add "net/url" to imports, and replace the inline filter blocks in listResourceProvidersHybrid and listResourceProvidersCRD to call this helper; in each caller, if the helper returns an error log it with log.Info and return http.StatusBadRequest with the error message (same behavior as the current resources-path).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/shim/placement/handle_traits.go`:
- Around line 567-584: The sync currently overwrites the entire Helm-managed
ConfigMap by calling s.writeTraits(cm, traitSet) with only upstream body.Traits,
causing churn; modify the logic in the placement sync (around s.Get,
s.writeTraits, s.Update and staticTraitsConfigMapKey) to merge upstream traits
into the existing ConfigMap data instead of replacing it — read existing trait
entries from cm.Data, combine them with traitSet (body.Traits) deduplicating by
name, then call s.writeTraits or an updated writer with the merged set and
Update; alternatively add a configuration flag or comment in docs if upstream
should be authoritative.
- Around line 531-585: syncTraitsFromUpstream issues unauthenticated GETs to the
Placement /traits endpoint which will fail when Keystone auth is required; fix
by either (A) adding a startup-time validation in Shim initialization that
checks hybrid-mode traits and SSO/transport-layer auth presence and logs a
warning/error (reference: syncTraitsFromUpstream, staticTraitsConfigMapKey, Shim
initialization path), or (B) implement service-token retrieval and injection:
add a helper that uses configured OSUsername/OSPassword/KeystoneURL to obtain a
Keystone token, cache/refresh it, and set the X-Auth-Token header on the
http.Request created in syncTraitsFromUpstream before s.httpClient.Do; also
change logging in syncTraitsFromUpstream to escalate repeated non-200 responses
(track consecutive failures and move from Info to Warn/Error after a threshold)
so silent failures become visible (reference: syncTraitToUpstream for
header-forwarding pattern and syncTraitsFromUpstream for where to attach the
token and adjust logging).
---
Outside diff comments:
In `@internal/shim/placement/handle_traits_e2e.go`:
- Around line 20-38: Update the doc comment in handle_traits_e2e.go to remove
the stale enableTraits wording and describe the mode-based gating: replace
mentions of "enableTraits" with references to the Traits feature mode check
(config.Features.Traits.orDefault() == FeatureModePassthrough), e.g. note that
Phase 1 always runs and that GET /traits is forwarded to upstream when the
Traits feature is in passthrough mode (FeatureModePassthrough), and that Phase 2
CRUD steps run only when the Traits feature mode is passthrough/managed as
appropriate—use the symbol config.Features.Traits.orDefault() and
FeatureModePassthrough in the wording so readers see the current runtime check
instead of the removed enableTraits flag.
---
Nitpick comments:
In `@internal/shim/placement/handle_allocations.go`:
- Around line 26-104: The handlers duplicate the same switch logic for feature
modes; add a helper method func (s *Shim) dispatchMode(w http.ResponseWriter, r
*http.Request, m FeatureMode) placed alongside FeatureMode in shim.go that
implements the shared switch (m.orDefault(): FeatureModePassthrough ->
s.forward(w,r); FeatureModeHybrid/FeatureModeCRD -> http.Error 501 with fmt
message using m; default -> http.Error 500). Then update each handler (e.g.,
HandleManageAllocations, HandleListAllocations, HandleUpdateAllocations,
HandleDeleteAllocations) to replace the switch with a single call
s.dispatchMode(w, r, s.config.Features.Allocations), and remove now-unused fmt
imports from the files where applicable.
In `@internal/shim/placement/handle_reshaper.go`:
- Around line 26-35: Extract the duplicated mode-dispatch into a Shim helper
named dispatchPassthroughOnly that accepts (w http.ResponseWriter, r
*http.Request, mode FeatureMode) and implements the 3-way switch: call
s.forward(w,r) for FeatureModePassthrough, return http.StatusNotImplemented with
a message including the resolved mode for FeatureModeHybrid|FeatureModeCRD, and
return http.StatusInternalServerError for default; then replace the switch in
HandlePostReshaper (and the other handlers) with a single call to
s.dispatchPassthroughOnly(w, r, s.config.Features.Reshaper.orDefault()), keeping
references to s.forward, FeatureModePassthrough, FeatureModeHybrid,
FeatureModeCRD and orDefault() to locate the code to change.
In `@internal/shim/placement/handle_resource_provider_traits_test.go`:
- Around line 74-146: Replace the two nearly identical tests
TestHandleResourceProviderTraits_HybridMode and
TestHandleResourceProviderTraits_CRDMode with one table-driven test (e.g.,
TestHandleResourceProviderTraits) that iterates over cases for
ResourceProviderTraits modes {FeatureModeHybrid, FeatureModeCRD}; for each case
create a Shim with Features.ResourceProviderTraits set from the case (using
newTestTimers, maxBodyLogSize, PlacementURL as before) and run subtests for the
three methods (s.HandleListResourceProviderTraits,
s.HandleUpdateResourceProviderTraits, s.HandleDeleteResourceProviderTraits)
calling serveHandler with the same path
"/resource_providers/"+validUUID+"/traits" and asserting the response code is
http.StatusNotImplemented; remove the two original functions
(TestHandleResourceProviderTraits_HybridMode and
TestHandleResourceProviderTraits_CRDMode) after consolidation.
In `@internal/shim/placement/handle_resource_providers_test.go`:
- Around line 886-980: The test and helper still use "flag off" wording but the
code now uses modes; rename TestHandleResourceProviders_FeatureFlagOff to
TestHandleResourceProviders_PassthroughMode (or similar), rename helper
newFlagOffShim to newPassthroughShim, and replace all "flag off" messages (e.g.
"flag off should forward, not 409") with wording that references
Passthrough/FeatureModePassthrough (e.g. "passthrough mode should forward, not
409"); update the comment header and any other identifiers or log strings that
mention "flag off" to use FeatureModePassthrough so names and messages
consistently reflect the mode model and reference the existing
FeatureModePassthrough, TestHandleResourceProviders_FeatureFlagOff, and
newFlagOffShim symbols when making the edits.
In `@internal/shim/placement/handle_resource_providers.go`:
- Around line 462-618: Extract the duplicated filter pipeline into a helper like
applyHypervisorFilters(ctx context.Context, query url.Values, items
[]hv1.Hypervisor) ([]hv1.Hypervisor, error) that applies
filterHypervisorsByUUID, filterHypervisorsByName, filterHypervisorsByMemberOf,
filterHypervisorsByInTree, filterHypervisorsByRequired and
filterHypervisorsByResources (returning a non-nil error only for invalid
resources), add "net/url" to imports, and replace the inline filter blocks in
listResourceProvidersHybrid and listResourceProvidersCRD to call this helper; in
each caller, if the helper returns an error log it with log.Info and return
http.StatusBadRequest with the error message (same behavior as the current
resources-path).
In `@internal/shim/placement/handle_traits_test.go`:
- Around line 68-110: The file still uses outdated section header comments
referencing enableTraits=true/false; update the comment lines above the
passthrough tests and the following handler tests to use the new mode names
(e.g., change the "// --- Passthrough tests (enableTraits=false) ---" comment to
something like "// --- Passthrough tests (mode=passthrough) ---" and the "// ---
Handler tests (enableTraits=true) ---" comment to "// --- Handler tests
(mode=crd) ---") so the sections around TestHandleListTraitsPassthrough,
TestHandleShowTraitPassthrough, TestHandleUpdateTraitPassthrough,
TestHandleDeleteTraitPassthrough and the subsequent handler tests correctly
reflect the current modes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 846369be-930d-445c-a7a5-1cc05fb8e84e
📒 Files selected for processing (24)
docs/architecture.mdhelm/bundles/cortex-placement-shim/templates/configmap-traits.yamlhelm/bundles/cortex-placement-shim/values.yamlinternal/shim/placement/handle_allocation_candidates.gointernal/shim/placement/handle_allocations.gointernal/shim/placement/handle_reshaper.gointernal/shim/placement/handle_resource_classes.gointernal/shim/placement/handle_resource_provider_aggregates.gointernal/shim/placement/handle_resource_provider_allocations.gointernal/shim/placement/handle_resource_provider_inventories.gointernal/shim/placement/handle_resource_provider_traits.gointernal/shim/placement/handle_resource_provider_traits_test.gointernal/shim/placement/handle_resource_provider_usages.gointernal/shim/placement/handle_resource_providers.gointernal/shim/placement/handle_resource_providers_e2e.gointernal/shim/placement/handle_resource_providers_test.gointernal/shim/placement/handle_root.gointernal/shim/placement/handle_root_test.gointernal/shim/placement/handle_traits.gointernal/shim/placement/handle_traits_e2e.gointernal/shim/placement/handle_traits_test.gointernal/shim/placement/handle_usages.gointernal/shim/placement/shim.gointernal/shim/placement/shim_test.go
…ndpoints - Resource provider traits: hybrid mode forwards to upstream (upstream authoritative, operator handles CRD sync); CRD GET serves traits from Hypervisor CRD; CRD PUT/DELETE return 501 (requires operator coordination). - Add hybrid and CRD mode unit tests for all 9 unimplemented endpoint groups (allocation_candidates, allocations, reshaper, resource_classes, resource_provider_aggregates, resource_provider_allocations, resource_provider_inventories, resource_provider_usages, usages) verifying they correctly return 501 Not Implemented.
The background trait sync loop calls GET /traits on upstream Placement without authentication, causing 401 failures. Fix by initializing a gophercloud ProviderClient during startup (with AllowReauth for automatic token renewal) and setting X-Auth-Token on sync requests. When Keystone credentials are not configured the sync continues to make unauthenticated requests as before (best-effort).
Replace manual token management (reading keystoneProvider.TokenID) with a gophercloud ServiceClient that handles auth headers and automatic token renewal (reauth on 401) transparently. Also adds the required OpenStack-API-Version: placement 1.6 header and sets the SSO transport on the provider after Keystone authentication so placement requests use the correct TLS certificates.
- Extract dispatchPassthroughOnly helper to deduplicate the mode switch across 20 passthrough-only handlers (drops ~160 lines) - Extract applyHypervisorQueryFilters to deduplicate the filter pipeline shared between listResourceProvidersHybrid and listResourceProvidersCRD - Rename FeatureFlagOff tests and newFlagOffShim to Passthrough naming - Update stale enableTraits references in doc comments and test headers
Test Coverage ReportTest Coverage 📊: 69.3% |
|
Addressed all nitpick items from CodeRabbit's review in commits c082903 and 6aacc4f:
|
The placement shim used inconsistent boolean flags (EnableResourceProviders, EnableRoot, EnableTraits) that only distinguished between forwarding to upstream and handling locally. This replaces them with a FeatureMode type supporting three modes: passthrough (forward to upstream), hybrid (combine upstream with local CRD data), and crd (operate without upstream). Each of the 11 endpoint groups now has its own mode field in featuresConfig, defaulting to passthrough when unset. For the root endpoint, hybrid mode computes the version intersection between upstream and the local static document. For traits, hybrid mode forwards requests to upstream and runs a periodic sync routine (60s interval with jitter) to keep the local ConfigMap in sync. For resource providers, hybrid retains the existing merge behavior while crd lists from Kubernetes only and returns 404 for non-KVM providers. Endpoints that have not yet implemented hybrid or crd logic return 501 Not Implemented. Helm values, unit tests, and E2E tests are updated accordingly.